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ABSTRACT 

We present predictions for the clustering of galaxies selected by their emission at far infra¬ 
red (FIR) and sub-millimetre wavelengths. This includes the first predictions for the effect 
of clustering biases induced by the coarse angular resolution of single-dish telescopes at 
these wavelengths. We combine a new version of the GALFORM model of galaxy forma¬ 
tion with a self-consistent model for calculating the absorption and re-emission of radiation 
by interstellar dust. Model galaxies selected at 850 pm reside in dark matter halos of mass 
-Mfiaio ~ — 10^^ h~^ Mq, independent of redshift (for 0.2 < z < 4) or flux (for 

0.25 < -S'sso/im ^ 4 mJy). At z ~ 2.5, the brightest galaxies {Ss50pm > 4 mJy) exhibit 
a correlation length of tq = 5.5('(Q g h~^ Mpc, consistent with observations. We show that 
these galaxies have descendants with stellar masses M* ~ 10^^ h~^ Mq occupying halos 
spanning a broad range in mass Mhaio 10^^ — 10^^ h~^ Mq. The FIR emissivity at shorter 
wavelengths (250, 350 and 500 pm) is also dominated by galaxies in the halo mass range 
Mhnio ~ — 10^^ h~^ Mq, again independent ofredshift (for 0.5 ^ ^ 5). We compare 

our predictions for the angular power spectrum of cosmic infra-red background anisotropies 
at these wavelengths with observations, finding agreement to within a factor of ~ 2 over 
all scales and wavelengths, an improvement over earlier versions of the model. Simulating 
images at 850 pm, we show that confusion effects boost the measured angular correlation 
function on all scales by a factor of ~ 4. This has important consequences, potentially leading 
to inferred halo masses being overestimated by an order of magnitude. 

Key words: cosmology: large-scale structure of Universe — galaxies: evolution — galax- 
ies:formation — galaxies: high-redshift — sub-millimetre: diffuse background — sub¬ 
millimetre: galaxies 


1 INTRODUCTION 

The emission from galaxies formed throughout cosmic history ap¬ 
pears as a diffuse cosmological background. The far infra-red (FIR) 
and sub-millimetre (sub-mm, 100 pm - 1 mm) part of this back¬ 
ground (CIB, e.g. [Puget et al.|1996l|Fixsen et al.|19^ is mostly 
produced by the re-emission of stellar radiation by interstellar dust, 
with a small (< 5%) contribution from dust heated by UV/X-ray 
emission from AGN (e.g.| Almaini, Lawrence & Boyle|1999||Silva,| 
[ Maiolino & Granato|2004| , and has a similar energy density to the 
background at UV/optical wavelengths (e.g. [Hauser & Dwek|2001| 
[Dole et al.|2006| >. This implies that most of the star formation over 
the history of the Universe has been obscured by dust. Understand¬ 
ing the nature of the galaxies that contribute to the CIB is therefore 
critical to a full understanding of galaxy formation. 

Much progress has been made in recent years to map the sky 
at these long wavelengths either from space, using satellites such 
as the Herschel Space Observatory ( jPilbratt et al.||2010t , its pre¬ 
decessor the Balloon-borne Large Aperture Sub-millimetre Tele- 
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scope (BLAST; Devlin et al.|2009 1 , and the Planck Satellit^ or 
from ground-based instruments, such as the Super Common User 
Bolometer Array 2 (SCUBA-2; [Holland et al.|2013) . However, one 
of the key problems with observations at these long wavelengths is 
confusion noise, caused by the coarse angular resolution [~ 20 arc- 
sec full width at half maximum (FWHM)] of the telescopes and the 
high surface density of detectable objects. This means that only the 
brightest objects can be resolved above the confusion background 
from imaging at these wavelengths. 

Whilst these individually resolved galaxies do not form the 
dominant contribution to the CIB (e.g. [Oliver et al.[2010[ l, they are 
important to study in their own right as they appear to be amongst 
the most highly star-forming objects in the Universe, as their 
FIR/sub-mm emission is thought to be powered by star formation, 
leading to inferred star formation rates (SFRs) of > 100 Mq yr“^ 


bank et al.||2014b. However, the use of gravitational lensing (e.g. 

Small, Ivison & Blain|1997[|Knudsen, van der Werf & Kneib 

2008 

Chen et al.|2013), stacking techniques (e.g. Bethermin et al. 

2012 

Geach et al. 2013J and interferometers (e.g.jHatsukade et al. 

2013 
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ICamiani et al.|[2015^ has to some extent circumvented the prob¬ 
lem of confusion noise and allowed up to ~ 80% of the CIB to be 
statistically resolved into galaxies. 

Placing these FIR/sub-mm galaxies into a consistent evolu¬ 
tionary context has proven challenging. In terms of resolved sub- 
mm galaxies (SMGs) it is still unclear what physical mechanism 
triggers the prodigious SFRs inferred from observations. In the lo¬ 
cal Universe {z < 0.3), the majority of ultra-luminous galaxies 
(Z/iR > 10^^ Lq) are gas-rich major mergers (e.g. Sanders & 
Mirabel fl996) , but whether this is the dominant triggering mech¬ 
anism at the peak of the SMG redshift distribution {z ~ 2.5, e.g. 
[Chapman et al.|[2005[ [Simpson et al.||2014[ l is unclear. Some dy¬ 
namical studies using emission lines from the *^CO molecule (e.g. 
[Tacconi et al.[2008| l and Ha (e.g. |Menendez-Delmestre et al.|2013^ 
have concluded that they see evidence of merger activity, though 
the sample sizes are small (< 10 objects). The merger scenario is 
also supported by some recent morphological studies (e.g. |Chen| 
[et al.[|20T5) . However, examples of rotationally supported discs 
have also been found (e.g. [Swinbank et al.[[2011^ suggesting the 
star formation was triggered by secular disc instabilities. Simula¬ 
tions suggest that the contraction of gas towards the centre of a 
galaxy, fuelling the star formation which results in the enhanced 
FIR/sub-mm emission (e.g.[Mihos & Hernquist|1996[ [Chakrabarti[ 


[et al.[20()^[Narayanan et al.|2010^ , could also cause accretion onto 

a supermassive black hole, with the resulting quasar phase quench¬ 


ing the star formation (e.g. |Di Matteo, Springel & Hemquist| 2005 


possibly resulting in compact quiescent galaxies (e.g. [Toft et ah] 
[2014[ l. It has been speculated that SMGs could then evolve onto 
the scaling relations observed for massive local elliptical galaxies, 
based on simple arguments involving the timescale of the burst and 
the ageing of the stellar population (e.g. [Lilly et al.|19^[Swinb ank[ 
[et al.[|200^ [Simpson et al.[[20T4l ), and assuming that most of the 
stellar mass at jz = 0 is put in place during the ‘SMG phase’. How- 
ever, [Gonzalez et ^ ( [2011^ present an alternative scenario in which 
SMGs evolve into galaxies with stellar mass ~ 10^^ h~^ M© at 
z — 0, with the SMG phase accounting for little of this stellar 
mass. 

An important constraint on any evolutionary picture can come 
from observational measurements of the clustering of selected 
galaxies, which provides information on the masses of the dark 
matter halos in which they reside. However, measuring the clus¬ 
tering of FIR/sub-mm galaxies has proven challenging. Some stud¬ 
ies have failed to produce significant detections of clustering (e.g. 


Scott et al. 2002[ [Webb et al.|2003[ [Coppin et al.|2006| [Williams] 


et al.|2011 , or the results derived from similar data have proven 

contradictory (e.g. [Cooray et al.|2010[ [Maddox et al.|20l0^ . Nev¬ 
ertheless, at 850 ^m |Hickox et al.[ ( [2012[ > used a cross-correlation 
analysis (e.g. [Blake et aH2006|> to find that SM Gs selected from 
the LES^^ source catalogue jWeiB et al.|2009 have a correlation 
length of ro = 3 Mpc. This result is consistent with an 

earlier study by [Blain et al.| ( [2004l l who used a pair-counting anal¬ 
ysis to show that SMGs selected from a number of SCUBA fields 
have a correlation length of 6.9 ± 2.1 h~^ Mpc. These correla¬ 
tion lengths are consistent with SMGs residing in halos of mass 
10^^ — 10'^® h~^ Mq. Both the Hickox et al. and Blain et al. 
studies were performed prior to interferometric observations, which 
showed that many single-dish sources are in fact composed of mul- 


^ Large APEX (Atacama Pathfinder Experiment) Bolometer Camera Ar¬ 
ray (LABOCA) Extended Chandra Deep Eield South (ECDES) Sub¬ 
millimetre Survey 


tiple, fainter galaxies (e.g. [Wang et al.|20TT| [Hodge et al.|20T3) l. It 
is currently unclear from previous work how this result affects the 
observed clustering of SMGs. We therefore present predictions for 
this in Sectionj?) 

Information about the clustering, and therefore host halo 
masses, of the unresolved FIR/sub-mm galaxies which contribute 
to the bulk of the CIB, can be obtained from the angular power 
spectrum of CIB anisotropies. The first attempts to measure this, 
by [Peacock et al.[j200Qt for the Hubble Deep Fi eld observed by 
SCUBA at 850 /^m, and Lagache & Puget 120001 for a 0.25 deg^ 
Infrared Space Observatory (ISO) field at 170 /rm, found at best 
only a tentative signal above the shot noise. More recently studies 
have been able to measure a clear signal (e.g. Viero et al. |2009[ 
Amblard et al. [2011| Viero et al. |2013| Planck Collaboration XXX 
[2014[ >, though significant modelling is required in order to interpret 
these results in terms of halo masses. The Viero et al. (2013) and 
Plank Collaboration studies infer the typical halo mass for galax¬ 
ies that dominate the CIB power spectrum as h~^ Mq 

and hr^ Mq respectively, making various assumptions 

such as the form of the relationship between galaxy luminosity and 
halo mass being independent of redshift, and that this relationship 
is the same for both central and satellite galaxies. 

Historically, hierarchical models of galaxy formation have 
struggled to simultaneously match the number density of FIR/sub- 
mm galaxies at high redshift (2 > 2) and the present day (2 = 0) 
luminosity function in optical and near-IR bands (e.g. |Granato et al.| 
|2000|l. It follows that theoretical predictions for the clustering, and 
host halo masses, of such galaxies are few. [van Kampen et ^ 
( [2005^ present a number of predictions for the angular clustering of 
SMGs under different scenarios. However, these models are phe¬ 
nomenological and do not attempt to predict the sub-mm flux of 
galaxies in a self-consistent manner. [Baugh et aL| ( [2005| l presented 
a version of GALFORM, the Durham semi-analytic model of hier¬ 
archical galaxy formation ( [Cole et ^[2000^ , which successfully 
reproduced the observed number counts and redshift distribution of 
SMGs at 850 pm as well as the 2 = 0 luminosity function in opti¬ 
cal and near infra-red bands. In order to do so these authors found 
it necessary to dramatically increase the importance of high red¬ 
shift galaxy mergers relative to earlier versions of GALFORM (e.g. 
[Cole et al.|2000[rBenson et al.|2003] ( through the introduction of a 
top-heavy IMF in starburst galaxies. In this instance sub-mm flux 
was calculated by combining GALFORM with the radiative transfer 
code GRASIL ( [Silva et al.[1998 [(. Predictions of the SMG clustering 
in this model were presented in [Almeida, Baugh & Lace^ ( [2011[ (, 
who found a correlation length of 5.6 ± 0.9 h~^ Mpc for galaxies 
with SsbOnm > 5 rnJy at 2 = 2, in good agreement with the subse¬ 
quent observational measurement of [Hickox et al.[ ( |2012[ (. The an¬ 
gular power spectrum of CIB anisotropies predicted by this model 
was presented in [Kim et al.[p012| ( and was within a factor of ~ 3 
of the measurements of the Planck Collaboration (XVIII, |201 1^ . 

Predictions for the clustering of FIR/sub-mm selected galaxies 
from hydrodynamical simulations of galaxy formation are limited 
due to the relatively small volumes that can ( currently) be simu- 
lated using this method ~ (100 h~^ Mpc)® (e.g. Vogelsberger et al. 


[2014[|Schaye et al.|2015[ l and the computational expense of the ra¬ 
diative transfer required to properly calculate the sub-mm fluxes 
of the simulated galaxies. Nevertheless, [Dave et al.[ ( |201Q[ ( used a 
hydrodynamical simulation to argue that 850 pm SMGs at 2 = 2 
should be a highly clustered population with a correlation length 
of ro ~ 10 h~^ Mpc and a bias of ~ 6. However, this work did 
not calculate the sub-mm flux for any of the simulated galaxies 
and instead relied entirely on the ansatz that SMGs are the most 
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highly star-forming galaxies at a given epoch, with a SFR selection 
limit chosen such that the number density of the simulated sample 
matched that of observed SMGs. 

Here we present predictions for the clustering, and host halo 
masses, of galaxies selected by total infra-red luminosity, and 
FIR/sub-mm emission. We use a new version of the GALFORM 
semi-analytic model ( [Lacey et al.||2015| henceforth LI5). This is 
combined with a simple model for the reprocessing of stellar ra¬ 
diation by dust in which the dust temperature is calculated self- 
consistently (as is done in e.g. [Gonzalez et al.|[2011[ [Kim et ^ 
[2012[ l. This paper is structured as follows: in Section|^we introduce 
the theoretical model, in Section we present predictions for the 
spatial clustering of galaxies selected by their total infra-red lumi¬ 
nosity (Lir), and by their 850 /im flux, in Section|^we make pre¬ 
dictions for the angular clustering of simulated galaxies selected by 
their 850 /rm flux, taking into account the effect of the single-dish 
beam used to make such observations, and in Sectionj^we present 
predictions for the angular power spectrum of CIB anisotropies at 
250, 350, and 500 /rm. We conclude in Section]^ 


2 THE THEORETICAL MODEL 

Here we introduce our model, which combines a dark matter 
only A^-body simulation, a state-of-the-art semi-analytic model of 
galaxy formation and a simple model for the reprocessing of stellar 
radiation by dust in which the dust temperature is calculated self- 
consistently based on radiative transfer and global energy balance 
arguments. We also briefly describe some of the physical properties 
of the dusty star-forming galaxies in this model. 


2.1 GALEORM 


The Durham semi-analytic model of hierarchical galaxy formation, 
GALFORM, was introduced in [Cole et al. (2000 1, building on ideas 
outlined by White & Rees[(T9^ i. White & Frenk[( |T991[ ( and [Cole| 
[et al.[ ( [1994| . Galaxy formation is modelled ab initio, beginning 
with a specified cosmology and a linear power spectrum of density 
fluctuations and ending with predicted galaxy properties at different 
redshifts. 

Galaxies are assumed to form within dark matter halos, with 
their subsequent evolution controlled in part by the merging his¬ 
tory of the halo. These halo merger trees can be calculated using 
a Monte Carlo technique following the extended Press-Schechter 
formalism (e.g. [Parkinson, Cole & Helly[[2008^ , or extracted di¬ 
rectly from a dark matter only A^-body simulation (e.g. [Helly et al.[ 
[2003[ [Jiang et ^[2014[ (. For this work we use halo merger trees 
derived from a Millennium-style dark matter only A^-body sim¬ 
ulation ( [Springel et al.[[2005[ [Guo et ^[2013^ , but with cosmo¬ 
logical parameters consistent with th e 7-year Wilkinson Mic rowave 
Anisotropy Probe (WMAP7) results (Komatsu et al. 201 iP hence¬ 
forth referred to as MR7. This simulation has a volume of (500 
h~^ Mpc)® and a minimum halo mass of 1.86 x 10^° h~^ Mq, 
slightly higher than the value for the original Millennium simula¬ 
tion (1.72 X lO'^® h~^ Mq). Throughout this work we use the halo 
merger trees and halo masses as defined by the ‘Dhalo’ algorithm 
( [Jiang et al.[2014[ l. 

Some studies have shown that including baryonic processes 


3 Ho = 0.272, Ao = 0.728, h = 0.704, Ho = 0.0455, ag = 0.81, 
ns = 0.967. 


(e.g. AGN feedback) in A-body simulations can affect the matter 
power spectrum by < 10% for scales A < 5 h~^ Mpc at 2 = 0 
when compared to that of the dark matter only counterpart, due 
to the redistribution of gas on these scales (e.g. [van Daalen et al.j 
|20TT] |. We note that this effect is not modelled here. However, we 
are confident that our science results are robust to this as we are 
mostly concerned with the clustering of galaxies on larger scales. 

In GALFORM, the baryonic processes thought to be impor¬ 
tant for galaxy formation are included as a set of continuity equa¬ 
tions which essentially track the exchange of mass between stel¬ 
lar, cold disc gas and hot halo gas components. The parameters in 
these equations are then calibrated against a broad range of data 
from both observations and simulations. Stellar population synthe¬ 
sis models (e.g. [Bruzual & Charlot|2003][Maraston[2005[ l are used 
to calculate stellar luminosities. For a more detailed description of 
the semi-analytic method see the reviews of [Baugh| ( [2006^ an d|B en-| 
[son[ ( [2010| (. 

Various GALFORM models exist in the literature. For this work 
we use a new model (L15) which incorporates a number of im¬ 
portant physical processes from earlier models and can reproduce 
an unprecedented range of observational data. The physical pro¬ 
cesses modelled include a prescription for radio-mode AGN feed¬ 
back ( [Bower et al.[2006) in which quasi-hydrostatic hot halo gas is 
prevented from cooling by energy input from relativistic jets, and 
an improved star formation law in galaxy discs based on an empiri¬ 


cal relation between star formation rate and molecular gas (Blitz & 


[Rosolowsky|2006) first implemented in GALFORM by [Lagos et al. 
(201 1^. There is also a mode of star formation which takes place 


in a galactic bulge, triggered by either a disc instability or a galaxy 
merger. Following such an event, the cold gas component in the 
galactic disc (formed through the cooling of hot halo gas) is trans¬ 
ferred to a bulge/spheroid and a star formation law in which the 
star formation rate timescale scales with the dynamical time of the 
bulge is used, until this gas is exhausted. This transfer of gas to the 
bulge also results in accretion onto a galaxy’s central supermassive 
black hole (SMBH). Throughout we use the term ‘starburst’ to re¬ 
fer to a galaxy undergoing bulge star formation, and ‘quiescent’ to 
mean one in which star formation occurs only in the disc. We note 
that these definitions do not necessarily align with, for example, 
those based on a galaxy’s position on the SFR-M* plane. This is 
discussed in more detail in a forthcoming work (Cowley et al. in 
preparation). 

A feature of the L15 model important here is the inclusion of 
a top-heavy stellar initial mass function (IMF) for star formation in 
bursts, which allows the model to reproduce the observed number 
counts of galaxies selected at a range of FIR/sub-mm wavelengths 
(250 — 1100 /rm, [Cowley et al^[2015[ L15) though a much less 
extreme IMF slope is used here than was advocated in [Baugh et al.[ 


Kennicutt 


1 1983 I IMF is used in 


( 20051*^ A solar neighbourhood 
disc (quiescent) star formation. 

We note that we do not vary any of the fiducial L15 model 
parameters for this work and as such the results presented here can 
be considered as true predictions of the model, as it was calibrated 
without considering any clustering data. 


For an IMF described by dn/d In M* oc , a; = 1 in L15 whereas 
a; = 0 was used in Baugh et al. For reference, a Salpeter jl955[ IMF is 
described by a; = 1.35. 


© 2015 RAS, MNRAS 000,[TpTj 




































































4 


W. I. Cowley et al. 


2.2 The Dust Emission Model 

To determine a simulated galaxy’s FIR/sub-mm flux, a model is 
required to calculate the absorption and re-emission of stellar ra¬ 
diation by interstellar dust. Flere, a simple model is used which 
assumes dust exists in two components, each with its own temper¬ 
ature: (i) dense molecular clouds of uniform density in which stars 
are assumed to form and (ii) a diffuse interstellar medium smoothly 
distributed throughout a double exponential disc. 

The energy of stellar radiation absorbed by each component 
is calculated by solving the equations of radiative transfer in this 
simple geometry. The dust emission is then calculated using global 
energy balance arguments, assuming the dust emits as a modified 
blackbody. Importantly this means that the dust temperature is not 
a free parameter, but is calculated self-consistently for each dust 
component in each galaxy. The model is therefore capable of mak¬ 
ing bona-fide multi-wavelength predictions without having to as¬ 
sume a shape for the spectral energy distribution (SED) of the dust 
emission. 

Despite its simplicity, the model is able to accurately repro¬ 
duce the predictions of the more sophisticated radiative transfer 
code GRASIL ( |Silva et al.||1998) for Arest ^ 70 /rm, thus we are 
confident in its application to the wavelengths under investigation 
in this paper. For more details regarding the dust emission model 
we refer the reader to|Cowley et al.|([2015|l and the appendix of L15. 


2.3 The Nature of Dusty Star-Forming Galaxies in the L15 
model 

Here we give a brief description of the properties of the dusty star¬ 
forming galaxies which dominate the CIB and SMG population in 
the L15 model, in order to aid the reader in understanding results 
presented later. 

Dusty star-forming galaxies are predicted to be predominantly 
starburst galaxies (i.e. star formation occurs within the bulge), with 
the starburst phase being triggered by secular disc instabilities. The 
importance of disc instabilities in the model is twofold: (i) they re¬ 
sult in faster gas consumption at higher redshifts by triggering star- 
bursts, and (ii) they are the dominant channel in the model for the 
growth of supermassive black holes which allow AGN feedback to 
suppress star formation in massive halos (Mhaio ^ 10^^ h~^ M©) 
at late times. This means that the model displays the requisite star 
formation at early times to reproduce the redshift distribution of 
sub-millimetre galaxies at z > 1 without overestimating it at lower 
redshifts. 

Dusty star-forming galaxies are mostly central galaxies. In the 
model, instantaneous ram-pressure stripping of the hot gas halo is 
implemented when a galaxy becomes a satellite (its hot halo gas 
component is transferred to that of the parent halo) and it is as¬ 
sumed that no more gas will accrete onto the disc of the satellite 
galaxy. For this reason, the star formation in satellite galaxies is re¬ 
duced due to their diminishing gas supply, and they form a minor 
proportion (< 5%) of the dusty star-forming population. 

Here we present some of the physical properties of the dusty 
star-forming galaxy population in the LI5 model, the illustrative 
values presented are the median values for the Lir > 10 ^^ h~^ Lq 
population at 2 ; = 2.6. Dusty star-forming galaxies are amongst 
the most massive galaxies in the simulation at a given epoch with 
stellar masses M* ~ 2 x 10^'^ h~^ M©, and they reside in 
dark matter halos most conducive to star formation in the model 
(A/haio ~ 10^^'® h~^ M 0 ). They also have high star formation 
rates ~ 140 h~^ M© yr“^, translating to specific star formation 


rates of ~ 8 Gyr“^ (approximately 10 x the sSFR of the model’s 
‘main sequence’), dust to stellar mass ratios, Mdust/AF* ~ 0.03 
and molecular gas fractions Mcoid,moi/(Afcoid,moi -f Af*) ~ 0.4. 


3 THE SPATIAL CLUSTERING OF DUSTY 
STAR-FORMING GALAXIES 

In this section we present predictions for the spatial clustering 
of simulated galaxies selected by their total infra-red luminosity, 
Lir, and their emission at 850 /rm. We discuss how the cluster¬ 
ing evolves with redshift, how this relates to the dark matter halos 
the selected objects occupy, and how the populations selected by 
Lir and S'ssoMm are related. We also briefly discuss the stellar and 
host halo mass of the 2 = 0 descendants of the 850 /rm selected 
galaxies. 

We present the predictions of our model in this section with¬ 
out considering any observational effects, such as the angular reso¬ 
lution of the telescopes used to identify galaxies at sub-mm wave¬ 
lengths, redshift-space distortions, the accuracy of observed red¬ 
shifts or any selection biases such effects can introduce. Some of 
these issues are dealt with in Section!?] 


3.1 The Two-Point Spatial Correlation Function 


We quantify the clustering of our selected galaxies by use of the 
two point spatial correlation function ^(r), which is defined as the 
excess probability of finding two galaxies at a given separation r > 
0 , compared to a random distribution: 

5Pi2{r)=n^\l + i{r)]5Vx5V2, ( 1 ) 


(e.g. |Peebles||l980^ where n is the mean number density of the 
selected galaxies at a given redshift and SVi is a volume element. 
The two-point correlation at r = 0 is described by a Dirac delta 
function 5° (r )/n (referred to as the shot noise term) as the galaxies 
are treated as point objects. 

On large scales the correlation function is shaped by the clus¬ 
tering of galaxies in distinct dark matter halos, referred to as the 
two-halo term (e.g. jCooray & Sheth||2002[ [Berlind & Weinberg 
|2002| >. On these scales the correlation functions of the dark mat¬ 
ter and galaxies have a similar shape but differ in amplitude. This 
difference in amplitude, or bias, is defined as 


b{r) = 


CDM(r) 


( 2 ) 


Although galaxy bias is scale dependent (e.g. jAngulo et al.|2008^ it 
is usually approximated as constant on large scales, where it is gov¬ 
erned by a weighted average of the bias values over the halos that 
are occupied. The effective bias of the selected galaxy population 
can then be written as 


/ 6(M)n(M)(ArgaliM)dM 
/n(M)(iVgai|M)dM 


where b{M) is the bias of halos with mass M, n{M) is the halo 
mass function such that n{M)dM describes the comoving number 
density of halos in the mass range [M, M + dM], and (Agai|M) is 
the mean of the Halo Occupation Distribution (HOD, the expected 
number of selected galaxies within a halo of mass M). 

We measure the correlation function in the simulation volume 
using the standard estimator (e.g.|Peebles|1980)): 


^(r) = 


DD{r) 

WgainAI/(r)/2 


(4) 
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where DD(r) is the number of distinct galaxy pairs with sepa¬ 
rations between r ± Ar/2, Agai is the total number of selected 
galaxies, n is their mean number density and ISV (r) is the volume 
of the spherical shell between r ± lS.r/2. We make use of the peri¬ 
odic nature of our simulation to calculate this volume analytically. 

We calculate errors using the volume bootstrap method ad¬ 
vocated in |Norberg et al.| ( f2009| l. We divide our simulation volume 
into Wsub = 27 subvolumes and for each bootstrap realisation draw 
SNaub subvolumes at random (with replacement). As our volume is 
no longer periodic due to the spatial sampling we calculate ^(r) for 
each bootstrap realisation using the estimator presented in |Landy| 
|& Szalay| ( [T9^ : 

DD{r)-2DRir) + RRir) 

~ RR{r) ’ ^ ’ 

where DD{r), DR{r) and RR{r) represent the number of data- 
data, data-random, and random-random pairs with separations be¬ 
tween r ± Ar/2. For each bootstrap realisation we generate a ran¬ 
dom catalogue with 10 times more points than there are galaxies in 
our initial sample, normalising the DR and RR terms in equation 
0 to have the same total number of pairs as DD. We calculate 
100 bootstrap realisations from which we derive the Icr percentile 
variation for each bin of separation. 


3.2 Spatial Clustering Evolution of Infra-red Luminous 
Galaxies 

Here we present predictions for the clustering of galaxies selected 
by their total infra-red luminosity Lir, derived by calculating the 
energy of stellar radiation absorbed by dust through solving the 
equations of radiative transfer in our assumed dust geometry. 

We show the model predicted spatial clustering for galaxies 
selected by their Lir in Fig.f^at a selection of redshifts, 2 ~ 0 — 5, 
and luminosities, Lir ~ 1(P — 10^^'® h~^ Lq. For clarity, we 
only show volume bootstrap errors for the most luminous (i.e. least 
numerous) population. We are confident that our selected galaxies 
are complete populations, at all redshifts considered here, and are 
not affected by the finite halo mass resolution of our simulation. 
We also plot the correlation function of the dark matter, calculated 
using a randomly chosen subset of 10 ® dark matter particles from 
the MR7 simulation, and can see that the selected galaxy popula¬ 
tions represent biased tracers of the underlying matter distribution. 
Note that we do not show ^(r) of the most luminous population at 
2 < 1 as the number of pairs of such objects in our simulation at 
these redshifts is not sufficient to provide a robust prediction. 

It is notable that the clustering of the selected galaxies shows a 
dependence on the selection luminosity, and redshift. This is sum¬ 
marised in Fig. which shows the redshift evolution of the co¬ 
moving correlation length, ro, defined such that ^(ro) = 1 , and the 
large scale bias of the selected populations. In the right panel of 
Fig .[^we show for reference the large-scale bias evolution of halos 
selected by their mass, calculated directly from the MR7 simula¬ 
tion. 

At all redshifts shown the two fainter luminosity populations 
are predominantly composed of quiescently star-forming galaxies, 
they display a similar clustering evolution, though systematically 
offset such that the brighter of these two populations is more clus¬ 
tered at all redshifts. The brighter two populations are predomi¬ 
nantly composed of starburst galaxieQand display a different clus- 


® The luminosity at which the infra-red luminosity functions predicted hy 


tering evolution to the fainter two samples, with ro displaying a 
monotonic relationship with redshift. 

Comparing with the large-scale bias evolution of mass- 
selected halos we can see that our most luminous population dis¬ 
plays an evolution consistent with them residing in halos of mass 
10^^ — 10^^ h ~^ Mq over the redshift range 2 ~ 1 — 5. 

These results can be understood better in the context of the 
halo masses sampled by the infra-red luminosity selection. In Fig.|^ 
we show the distribution of galaxies in the star formation rate - halo 
mass plane for all galaxies (left panels) and for the infra-red lumi¬ 
nosity selected populations (right panels). We can see that the dis¬ 
tribution of SFRs is broad for halo masses Mhaio > 10^^ h~^ Mq 
and that the infra-red selections pick up a broad range of halo 
masses. We also see how this distribution evolves. At 2 = 4.2 the 
infra-red selection means that samples with increasing Lir have 
increasing median halo masses, leading to them being more biased 
than samples selected by a lower infra-red luminosity. At 2 = 1.5 
this is no longer the case, as the most luminous population has a 
slightly lower median halo mass than the next most luminous. This 
breaks the monotonic relation of increasing bias with increasing 
luminosity seen at higher redshifts. 

In Fig. [^we also com pare our predictions to the observational 
estimates of Dolley et al.|pbl4^ , who used far infra-red luminosi¬ 
ties derived from 24 /rm fluxes. We show the ro values for their 
redshift bins that are complete in infra-red luminosity, for clarity 
showing only most and least luminous samples within each red¬ 
shift bin. The colour scale indicates the mean infra-red luminosi¬ 
ties of their samples, the bins for which have a width of 0.25 dex in 
Lir. Whilst the overall agreement is generally favourable, Dolley 
et al. find, in contrast to our predictions, that for 2 < 1 at a fixed 
redshift ro increases with increasing luminosity. The model also 
appears to underpredict the clustering of ~ 10^^'® h~^ Lq galax¬ 
ies at 2 ~ 1 and overpredict the clustering of ~ 10^®'® Lq 
galaxies at 2 ~ 0.3. 

There could be a number of reasons for this discrepancy. Dol¬ 
ley et al. assumed a power-law slope of 7 = 1.9 in order to de¬ 
rive a correlation length. If a lower value is used (as favoured 
by our model) they note that this increases their estimated cor¬ 
relation lengths (e.g. assuming 7 = 1.8 gave correlation lengths 
~ 0.5 Mpc larger). Our model shows a variation of power-law 
slope with redshift and infra-red luminosity, with lower luminosity 
samples having generally flatter slopes. It is also unclear whether 
the simulated galaxies follow the relation used by Dolley et al. to 
derive Lir from the observed 24 fim photometry, which is based 
on templates derived from local galax ies ^Rjeke^ljL 2009|l an d ad¬ 
justed at higher redshifts according to |Rujopakam et ah ( 2013^ . Al¬ 
ternatively, further investigation into the physical processes which 
produce the distribution of galaxies on the SFR-Mhaio plane as pre¬ 
dicted by the model (Fig.[^ is required to understand how the pre¬ 
dicted clustering could be brought into better agreement with the 
Dolley et al. results. 

Our predictions for correlation length in this section are lower 
than the observational estimates of |Farrah et al.| ( [2006t , who infer 
correlation lengths of 9.4±2.2 and 14.4±2.0 h~^ Mpc for galaxies 
at 2 ~ 1.7 and 2.5 respectively, with Lir > 5 x 10^^ h ~^ Lq. 
However, we do not consider this a significant discrepancy, due to 
the complicated selection criteria of the Farrah et al. sample, which 


our model become dominated by starburst galaxies is dependent on redshift. 
For example, at 2 = 0 the luminosity function is dominated by starbursts for 
Lir > 10^^'® h~^ Lq, at 2 = 4.9 this limit is Lir > 10®® ® h~^ Lq. 


© 2015 RAS, MNRAS 000, [TpT] 




















6 W. I. Cowley et al. 



Figure 1. Main panels: The predicted two-point spatial congelation function, ^(r), as a function of comoving separation, r, for galaxies selected by their total 
8—1000 /jm luminosity, Lir, at the redshift indicated in each panel. The cyan, blue, red and green lines show galaxies with Ljjr = 10® —10®'®, 10^® —10^® '®, 
10^^ —10^^ ® and 10^® —10^®'® Lq respectively. The green shaded region shows the Icr volume bootstrap errors for the Lm = 10^® —10^® ® /i“® Lq 
population. The black line indicates the correlation function measured for dark matter particles in the MR7 simulation. The vertical and horizontal dashed grey 
lines are drawn for reference at r = 5 h~^ Mpcand^ = 1 respectively. The diagonal black dash-dotted line, again for reference, indicates a 7 = —l.Spower 
law with a correlation length of 5 h~^ Mpc. Sub panels: As for the main panels but indicating the bias, defined as A horizontal grey dashed 

line indicating a bias value of 1 is drawn for reference in each panel. 


we do not attempt to model here, and assumptions made by those 
authors regarding the redshift distribution of their sample, and their 
parametrisation of ^(r, z). 

3.3 Spatial Clustering of SMGs 

In this section we present the spatial clustering of galaxies selected 
by their 850 pm flux. We focus on this wavelength as it is histori¬ 
cally the wavelength at which the majority of ground-based obser¬ 
vations of FIR/sub-mm galaxies have been performed, due to the 
atmospheric transmission window. The real space two-point cor¬ 
relation function and large-scale bias for our selected galaxies are 
presented in Fig.|^over a range of redshifts which span the peak of 
the redshift distribution of the selected SMGs. 

We consider three samples of galaxies selected by flux: (i) 
a bright population with Sgso/jm > 4 mjy (median Lir, ~ 


10^®'® /i“® Lq at 2: = 2.6, green line) as this is a typical limit 
at which single-dish surveys can detect SMGs (e.g. |Wei6 et al.| 
|2009| though note we do not consider the effects of the single¬ 
dish beam in this section), (ii) an intermediate population with 
Sssonm > 1 mJy (median Lir ~ /i“® L© at 2 = 2.6, 

red line) as this is an approximate limit to which ALMA detected 
galaxies as part of Cycle 0 observations (e.g. |Hodge et al.|2013| l 
and (iii) a faint population with SgsoMm > 0.25 mJy (median 
Lir ~ 10^^'® /i“® L© at 2 = 2.6, blue line) which are in prin¬ 
ciple detectable by ALMA, though with longer integration times 
and more antennae than were used in Cycle 0. Our selected galax¬ 
ies exhibit clustering with ro ~ 5 h~^ Mpc, with little dependence 
on flux, for the fluxes considered here. 
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Figure 2. Left panel: Evolution of the comoving coiTelation length ro [defined such that ^(ro) = 1 ]. The cyan, blue, red and green lines show galaxies 
with Li^ = 10® — 10®'^, 10^® — 10^®'^, 10^^ — and 10^® — h~^ Lq respectively. T he erro rs indicate la volume bootstrap errors for 

the Lin = 10 ^® — h~^ Lq population. A selection of observational estimates from Dolley et al. { 2014 } are shown as circles, with the colour scale 

indicating the mean Lm for each sample, as shown on the inset colour bar. Right panel: As for the left panel, but indicating the evolution of the large scale 
bias. The dotted, dashed and dash-dotted lines indicate the bias evolution for halos of > 10^^, 10^^ and 10^^ h~^ Mq respectively, as measured directly 
from the MR7 simulation. 
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Figure 3. Predicted distribution of galaxies in the star formation rate - halo mass plane at 2 : = 4.2 (top panels) and 2 : = 1.5 (bottom panels). Left panels: 
Distribution of all galaxies with the shading representing the galaxy number density at that position on the plane, with red indicating the highest number 
densities and purple the lowest. Open circles show the median SFR in bins of halo mass, with the errorbars indicating the 16 — 84^^ percentile scatter. Right 
panels: Distribution of galaxies selected by their total infra-red luminosity for luminosities of 10^® — 10^^'^ (green), 10^^ — 10^^'^ (red), 10^® — 10^®'^ 
(blue) and 10® - 10®-^ /i"® Lq (orange contours). The open symbols indicate the median halo mass and SFR in the corresponding luminosity bin, with the 
errorbars indicating the 16 — 84**^ percentile scatter in halo mass. 


3.3.1 SMG Halo Occupation Distribution 

We can gain further insight into the clustering of the selected SMGs 
from Fig. 1^ which shows their halo mass probability distribution 
(i.e. the product of the halo mass function and the mean of the 
HOD - n(m)(Wgai|M) in equation [4 left panels) and the mean 
of the HOD ((Wgai|M) in equation^ right panels) at redshifts 
z — 3.1 and 2.1 (top and bottom panels respectively). It is evi¬ 
dent from the left panels that SMGs reside predominantly in ha¬ 
los of mass ~ ® — 10^^ h~^ Mq, the halo mass range most 


conducive for star formation in our model over a broad range of 
redshifts (see Fig. 27 of L15). For example, at z = 3.1: 87, 74 
and 54% of galaxies in the Ssso/im > 4, 1 and 0.25 mjy selected 
populations respectively reside in halos within this mass range. At 
z = 2.1 these percentages are 75, 69 and 53% respectively. The 
halo mass at which the probability distribution peaks seems insen¬ 
sitive to the 850 fxm flux of the galaxies and their redshift, although 
fainter galaxies do occupy a broader range of halo masses, and the 
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Figure 4. Top panels: The spatial coirelation function for 850 pm selected galaxies at redshifts of 1.5, 2.5 and 3.5 (left to right). The blue, red and green 
lines show the correlation function for SssOum > 0.25, 1.0 and 4.0 mly respectively. The green shaded region shows the Icr volume bootstrap errors for the 
SssOn-in > 4.0 mly population. The black line indicates the correlation function measured for dark matter particles in the MR7 simulation. The vertical and 
horizontal dashed grey lines are drawn for reference at r = 5 h~^ Mpc and § = 1 respectively. The diagonal black dash-dotted line, again for reference, 
indicates a 7 = —1.8 power law with a con'elation length oi5 h~^ Mpc. Bottom panels: As for the top panel but indicating the bias, defined as 
A horizontal grey dashed line, drawn for reference in each panel, indicates a bias of 1. A horizontal black dotted line, again drawn for reference, indicates a 
bias of 1.7, 2.9 and 4.2 (left to right). 


distribution for satellite galaxies (dashed lines) peaks at a higher 
halo mass (~ 2 x 10^^ h~^ Mq). 

In the right panels the HODs for central galaxies (dotted lines) 
peak below unity for all samples. The HODs only reach unity for 
satellites in fainter samples in massive halos (Mh > 10^® h~^ Mq 
at z = 2.1). Models which force (A^smGs.c) = 1 and adopt the 
same number density of SMGs would place them in more massive 
halos than predicted by our model. An S'gsopm > 1 mjy galaxy is 
hosted in roughly 1 in every 10 halos of ~ 10^^ h~^ M©, show¬ 
ing the need for a large number of halo histories to be sampled (i.e. 
large cosmological volumes simulated) in order to make robust pre¬ 
dictions for the SMG population as a whole (see also e.g.|Almeida,| 
[Baugh & Lacey|2011[priler et al.|20l5) ). 

We attribute the minima in the HODs for the central galaxies 
to merger-induced SMGs. In our model AGN feedback becomes 
effective in massive haloes (Mpaio ^ 10^^ h~^ Mq), which pre¬ 
vents hot halo gas from cooling, limiting the fuel for star formation 
and leading to the downturn in the HOD. Galaxy mergers bring in a 
fresh reservoir of cold gas to central galaxies, allowing further star 
formation in these high mass (> h~^ Mq) halos without the 

need for in-situ gas cooling. 


3.3.2 The Evolution of SMG Clustering 

We show the evolution of the correlation length ro in the left panel 
of Fig. 1^ This is approximately constant for z < 2 but increases 
with increasing redshift at higher redshifts. The errorbars shown 
are derived from the Icr bootstrap errors described above. 

In the right panel of Fig.|^we show the evolution of the large 
scale bias with redshift, in addition plotting for reference the evo¬ 
lution of the large-scale bias for halos selected by their mass. We 
can see that the bias evolution of our galaxies is of a similar form 
to that of the halos, indicating that SMGs typically reside in halos 


of — 10^^ h ^ Mq, over a large redshift range. This is in 
agreement with our previous findings in Fig.|^ 

In Fig. 1^ we compare to the observational results of jHicko)^ 
jet al.j P012f ~and [Blain et al.j j2004[ >. Hickox et al. use sub-mm 
sources from the single-dish LESS source catalogue jWeiB et'aF] 
|2009[ >, with S'sso^tm ^ 4.5 mJy, at redshifts of 2 ~ 2 — 4, cover¬ 
ing 0.35 deg^, and use the cross-correlation of these with IRAC se¬ 
lected galaxies over a similar redshift range, taking into account the 
photometric redshift probability distribution of their SMGs ( |Ward-| 
[low et ar]|2011| ), to derive a large scale bias of 3.4 ± 0.8 from 
which they find a correlation length of ro = 7. 7^2 3 Mpc 
assuming a power-law correlation function [^(r) = (r/ro)”^] 
with 7 = 1.8. Blain et al. also assume a power-law ^(r) with 
7 = 1.8, and a Gaussian redshift distribution ([Chapman et al.j 
[2005 [ 1 , whilst allowing ro to vary in order to match the number of 
SMG (^sso/jm ^ 5 mJy) pairs observed across a number of non¬ 
contiguous SCUBA fields with a combined area of ~ 0.16 deg^. 
They obtain a correlation length ofro = 6.9±2.1/i~^ Mpc but 
note that if they exclude the most overdense field from their analy¬ 
sis, they derive ro = 5.5 ±1.8 h~^ Mpc, which is in better agree¬ 
ment with our predictions. However, due to the significant errors 
on the observational data and potential biases due to the single-dish 
beam used in these studies which we discuss in Section]^ it is dif¬ 
ficult to draw any strong conclusions about the level of agreement 
between the model and data. 

From comparing the left panel of Fig. I^to that of Fig.|^ we 
can see that the clustering evolution of our SMG populations are 
remarkably similar to that of our most infra-red luminous galax¬ 
ies (Lir = 10^^ — 10^^'® h~^ Lq). We note that at 2 = 2.6 the 
median 850 pm flux for galaxies in our most luminous Lir bin 
(10^^ — 10^^'® h~^ Lq) is 3 . 3 I 1 g mJy, where the errorbars rep¬ 
resent the 10 — 90 percentiles. Conversely, at the same redshift the 
‘S'sso/jm > 4 mJy population has a bolometric dust luminosity of 
Lir = 10^^ °'^ — 10^^'"^'^ h~^ L@ (10-90 percentiles). Thus in our 
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Figure 5. Probability distribution of halo mass (left) and halo occupation distribution (right) for 850 /rm selected SMGs at 2 = 3.1 (top) and 2.1 (bottom). The 
blue, red and green lines indicate the HOD for the Sssomiu > 0.25, 1.0 and 4.0 mJy respectively, with the dashed (dotted) lines depicting satellite (central) 
galaxies. A horizontal dash-dotted line is drawn in both right panels at (NgMGs) = 1 for reference. 


model the 850 /im selection selects the most infra-red luminous 
starburst galaxies (our predicted galaxy number counts at 850 fim 
are dominated by starburst galaxies for Sgso/jm ^ 0.2 mJy), hence 
the similarities in the model predicted clustering evolution of SMGs 
and the most infra-red luminous galaxies. 


3.3.3 SMG Descendants and Environment 

Arguments which assume that the majority of 2 = 0 stellar mass of 
an SMG descendant is formed during the sub-mm bright phase im¬ 
ply that by fading the stellar population, SMGs could evolve onto 
the 2 = 0 scaling relations of massive ellipticals (assuming a burst 
duration of typically ~ 100 Myr, e.g.|Swinbank et al.|2006||Simp-| 
[son et all2014^ . Here we investigate the stellar and halo masses 
of the 2 = 0 descendants, presenting our findings for the bright 
population (S'ssoMm > 4 mJy) in Fig.|^ 

We find that across all redshifts shown in Fig. which span 
the majority of the redshift distribution for this population, the 
selected galaxies evolve into galaxies with a stellar mass of ~ 
10^^ h~^ Mq at the present day. This is similar to the results pre¬ 


sented from an analysis of an earlier version of the galaxy forma¬ 
tion model used here jGonzalez et al.|2011t. 


The stellar masses of SMGs inferred from observations are the 
subject of much debate. They are typically inferred by SED fitting 
to broadband photometry, making a range of assumptions regard¬ 
ing the AGN contamination, dust obscuration, star formation his¬ 
tory and IMF of the galaxies in question. Early estimates appeared 
to disagree by factors of ~ 5 — 10 for the same sample of SMGs. 
Hainline et al.|l20TT|l quoted a median stellar mass for the |Chap^ 


man et al. 


_ j2Q05 I sample (Ssso/jm > 5 mJy) of ~ 5 x 10^° h Mq 

[assuming a |Krou^ ( |2002| l IME] in contrast to the higher value of 


2.6 X 10^^ h~^ M© found by |Michalowski, Hjorth & Watson 
( |2010^ [assuming a Chabrier ( |2003| IME], though subsequent work 
by [Michalowski et al.| ( |201^ suggested that this discrepancy was 
mostly due to the assumed star formation histories used by the two 
studies, once differences due to the choice of IME were taken into 
account. [Michalowski et al.]p012[ l also obtained a revis ed median 
stellar mass of ~ 1.4 x 10“ h~^ MrT). More recently Ida Cunha 


et al. 


12015 I derive a median stellar mass of <■ 


recently 
6 X 10^'^lr^ 


Mr 
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Figure 6. Left panel: Evolution of the comoving correlation length ro [defined such that ^(ro) = 1] with redshift, for galaxies with SsbOn-'m. > 0.25,1.0 and 
4.0 mJy (blue, red and green lines respectively). The errors indicate Icr volume bootstrap en'ors for the S'gsoAim > 4.0 mJy population. The observational 
data are taken from Hickox et al. j2012| squares) and Blain et al. j2004| triangles). Right panel: Symbols and coloured lines as for the left panel but indicating 
the evolution of the large scale bias. The dotted, dashed and dash-dotted lines indicate the bias evolution for halos of M^aio > 10^^, 10^^ and 10^^ h~^ Mq 
respectively, as measured directly from the MR7 simulation. 


10^5 

T 


Figure 7. The descendants of S'ssoAim > 4.0 mJy selected galaxies in 
our simulation. The squares and triangles indicate the median stellar and 
host halo mass of the selected galaxies respectively, with the filled sym¬ 
bols indicating this quantity at the redshift of interest and the open symbols 
indicating this quantity for the 2 = 0 descendant. The en'or bars indicate 
10 — 90 percentile ranges. The open squares and filled triangles are off¬ 
set in redshift by ±0.025 for clarity. A dotted horizontal line is drawn at 
M = 10^^ h~^ M 0 for reference. 
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slightly lower masses, for example the Ssso^m > 0.25 mJy popu¬ 
lation evolve into galaxies with stellar mass ~ 5 x 10^° h~^ M©. 
Note also that here we consider unique descendants, such that if 
two galaxies selected at a given redshift evolve into the same de¬ 
scendant at 2 = 0 it is only counted once. 

In terms of halo mass, whilst sub-mm selected galaxies occupy 
a relatively narrow range of halo masses (~ 0.5 dex) at the redshift 
at which they are selected, the range of halo masses which host 
the z = 0 descendants is broad, spanning nearly two orders of 
magnitude ~ 10^^ — 10^'* h~^ M©. In our model it appears then 
that bright SMGs do not necessarily trace the most massive 2 = 0 
environments. As with stellar mass, here we consider unique halos, 
such that if a halo contains two galaxies selected at a given redshift, 
or the 2 = 0 descendant(s) of two galaxies selected at a given 
redshift, it is only counted once. 

Our results for stellar and halo masses of bright SMGs and 
their descendants are a factor of ~ 5 lower than those found by 
[Munoz Arancibia et al.| ( [2015t . However, their simulations do not 
self-consistently predict the sub-mm flux of galaxies as is done in 
this work, but instead rely on a ‘count-matching’ approach to link a 
galaxy’s physical properties to its sub-mm flux. They infer median 
stellar and halo mass of and h~^ M© respectively for 

SMGs; and and h~^ M© respectively for the 2 = 0 

descendants of SMGs. 


by applying the SED fitting code MAGPHYS [assuming ajChabrierj 
( |20^ IMF] to the ALESS ( [Hodge et aI.|20T3t SMG sample. 

Our predicted stellar masses lie at the lower end of values 
quoted in the literature however, it is difficult to understand the 
significance of the (dis)agreement. The comparison is further com¬ 
plicated by the top-heavy IMF for starbursts assumed in the model. 
We therefore consider a proper comparison of the stellar masses of 
SMGs predicted by our model and those inferred from observations 
to be beyond the scope of this paper, and caution the reader against 
over-interpreting the values presented briefly here. 

The predicted masses presented in Fig.|^are qualitatively sim¬ 
ilar for the fainter populations, though they systematically shift to 


4 ANGULAR CLUSTERING AT 850 fim 

The simplest measure of clustering from a galaxy imaging survey 
is the angular two-point correlation function w{9). Analogously to 
equation l [T), t he probability of finding two objects separated by an 
angle d > (Ijis defined as: 

5Pi 2{9) = rf[l + w{9)]SQ,iSQ2, (6) 

® Analogously to the spatial case, at 6 = 0 the correlation function is 
described by a Dirac delta function, S^{9)/r}, which is referred to as the 
shot noise term. 
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Figure 8. The predicted angular correlation function for 850 fim selected 
galaxies (uig) with Sgsorim > 0.25,1 and 4 mJy (blue, red and green lines 
respectively). Also shown is the angular correlation function for sources 
with Sgsorim > 4 mJy extracted from simulated single-dish sub-mm imag¬ 
ing produced with a 15 arcsec FWHM Gaussian beam (magenta line) with 
the corresponding shaded region indicating the hr (16 — 84*** percentile) 
field-to-field variation over 50 lightcone realisations of 4 deg^ each. For 
reference, the diagonal dotted and dash-dotted lines show two m oc 6 ^“^ 
power laws, with 7 = 1 . 8 , offset from each other in amplitude by a factor 
of 4. 


where 77 is the mean surface density of objects per unit solid an¬ 
gle and is a solid angle element, such that w{9) represents the 
excess probability of finding objects at angular separation 9, com¬ 
pared to a random (Poisson) distribution. 

In this section we present the angular correlation function of 
galaxies, irtg, selected by their 850 nm emission. We compare this 
to the angular correlation function of sub-mm sources, Ws, ex¬ 
tracted from simulated single-dish 850 fj-m imaging following the 
method presented in |Cowley et al.|j2015^ , and the angular correla¬ 
tion function of 850 /rm intensity fluctuations, wi. 


4.1 The Angular Clustering of Galaxies 

Angular clustering, ui(9), can be thought of as the on-sky projec¬ 
tion of ^(r, z), weighted by the number density of selected objects 
at a given redshift. We therefore use the approximation of |Limber| 
1 1953^ to calculate Wg{9) from ^(r, z), the spatial two-point cor¬ 
relation function. This assumes that the selection function (redshift 
distribution) of galaxies changes slowly over the comoving separa¬ 
tions r for which ^(r, z) is appreciably non-zero. Assuming a flat 
cosmology (as we do throughout), this allows Wg(9) to be related 
to ^ (r, z) by 


Wg(9) 


fN(zf^dzfdn^(r,z) 

lfN(z)dzr 


( 1 ) 


where N{z) is the predicted redshift distribution of the selected 
galaxies, dz/dx = HqE{z)/c with E{z) = + zf + 

X corresponds to the comoving radial distance to red¬ 
shift z. The comoving line of sight separation u is defined by 
r = [u^ + where ud(2 = [1 — cos(0)]. We present 

Wg for our sub-mm selected galaxy populations, as defined in the 
previous section, in Fig.[^ 


4.2 The Angular Clustering of Single-Dish Sources 


To make predictions for the angular clustering from sub-mm 
sources that would be observed in single-dish surveys we simu¬ 
late such observations using the method presented in|Cowley et al.| 
| |20T5) . 

Briefly, we generate light cone catalogue s of simu lated SMGs 
using the method described in 


Merson et al. 


1 20131' 


We include 

in our lightcone catalogue galaxies brighter than the flux at which 
90% of the predicted CIB at 850 /rm is recovered. The predicted 
value of the CIB is in good agreement with the observations of 
|Fixsen et al.| jl998^ , and thus gives our image a realistic back¬ 
ground. The galaxies are then binned into pixels according to their 
on-sky position, with the flux value of a pixel being the sum of the 
fluxes of all the galaxies within it. The pixel scale is chosen such 
that the beam is well sampled. This image is then smoothed with 
a Gaussian with a FWHM chosen to be equal to that of the beam 
used in observational studies following which Gaussian white noise 
is added of a magnitude comparable to that found in observations. 
The image is constrained to have a mean of zero by the subtrac¬ 
tion of a uniform background, and then matched-filtered prior to 
source extraction. Sources are found by iteratively identifying the 
maximal pixel in the map and subtracting off the matched-filtered 
PSF scaled to and centred on the value and position of the pixel. 
For simplicity the position of the source is recorded as being at the 
centre of the identifying pixel. The result of this source extraction 
is referred to hereafter as our source-extracted catalogue. 

Here we choose to make predictions for the 850 /rm SCUBA-2 
Cosmology Legacy Survey (S2CLS, e.g. |Geach et al.||2013) >, as 
measuring the clustering of SMGs is one of the main survey goals. 
For this reason we choose a Gaussian beam with a FWHM of 
15 arcsec (similar to that of the SCUBA-2/JCMT configuration at 
850 /rm). In order to estimate field-to-field variation we generate 
50 X 4 deg^ randomly orientated lightcones. We add instrumen¬ 
tal Gaussian white noise with ainst = 1 mjy/beam, which gives 
our maps a total noise of crtot « 1.2 mJy/beam, calculated from a 
pixel histogram of our image. This broadening of the noise distri¬ 
bution is due to the confusion noise from faint unresolved galaxies 
in the image, as atot ~ extract sources down 

to 4 mJy (~ 3.5a) as this is the typical limit at which sources are 
extracted out of single-dish surveys (e.g.|Coppin et al.|2006||Wei6| 
let al.|2009| >. 


To calculate Ws for our source extracted catalogue we use the 
angular equivalent of equation To ensure we are not affected 
by noise in the random catalogue, we generate random catalogues 
using the same selection function as for the data (i.e. same survey 
geometry) but with 250 times the number of points as there are 
sources for each of our simulated surveys. 

In estimating Ws(9) for each of the 50 lightcone realisations 
we used the actual number of sources in each field to calculate the 
mean surface density in order to match what is done observation- 
ally, rather than the true mean surface density. This causes the mean 
angular correlation function to be underestimated by an average 
amount 


fr^ = ^ J WJtrue(6*) dHidH2, (8) 

( |Groth & Peebles|1977^ due to the integral constraint (that by con¬ 
struction the estimated angular correlation function will integrate to 
zero over the area of the field), where Wtiuef) is the true angular 


^ This does not include any treatment of gravitational lensing. 


© 2015 RAS, MNRAS 000, [TpT] 
































12 


W. I. Cowley et al. 



Figure 9. Comparison of the predicted angular correlation function for our 
‘S’ssOAim > 4 mJy simulated single-dish source catalogue, Ws (magenta 
line), to observational estimates from Scott et al. J2006| filled squares) and 
Wei6 et al. j20Q9| open circles). The shaded magenta, cyan and orange re¬ 
gions indicate the 2cr (2.25 — 97.75^^ percentile) range derived from the 
field-to-field variation over 50 lightcone realisations for fields of 4, 1 and 
0.5 deg^ respectively. 


correlation function of the sources and the angular integrations are 
over a field of area 12. This quantity is related to the field-to-field 
variation in the number counts through 


(9) 


^2 ^ ^ 

(e.g. |Efstathiou et al.|1991| l. We evaluate equation for our 50 x 
4 deg^ lightcones and find = 4.8x 10 which we add onto our 
computed angular correlation functions for sub-mm sources {Ws)- 
In Fig.j^we show the mean Ws{ 0 ) from the 50 lightcone real¬ 
isations (magenta line), with the corresponding shaded region indi¬ 
cating the la (16 —84‘*' percentile) field-to-field variation in Ws{ 0 ) 
in each bin of angular separation. In Fig. M we compare Waid) with 
observational estimates from the 0.35 de^ LFSS source catalogue 
(WeiB et al. |2009| 19 arcsec FWHM, S'sso/jm ^4.5 mJy); and from 
sources identified from a compilation of non-co ntiguou s SCUBA 

15 arcsec 


2006 


fields totalling ~ 0.13 deg in area (Scott et al. 

FWHM, SsbOtim Y 5 tnJy). The magenta, cyan and orange shaded 
regions indicate the 2cr (2.25 — 97.75**' percentile) field-to-field 
variation in each bin of angular separation we predict for fields of 
4, 1 and 0.5 deg^ respectively, which must be considered when 
comparing theory and observations. For this we recalculate the an¬ 
gular correlation function for each field considering only sources 
within the central 1 or 0.5 deg^. As in Fig.[^ the large error bars 
of the observational data make a detailed comparison difficult and 
highlight the need for larger sub-mm surveys. We note however, 
that our predictions are consistent with the data once field-to-field 
variations are taken into account. 


4.3 Blending Bias in the Angular Clustering of Single-Dish 
Sources 

One of the key results of this work, evident in Fig. is that the 
angular correlation function of sources, Wa, is greater in amplitude 
by a factor of ~ 4 than the angular correlation function of galax¬ 
ies, Wg, for the source flux limit used here (4 mJy). In this section 
we investigate the dependence of this effect on a number of factors 
and conclude that is to due to confusion in the simulated survey 
caused by the 15 arcsec FWHM beam blending the emission of 



Figure 10. The effect of beam-size, instrumental noise and the clustering 
of faint (SssOrim < 2 mJy) galaxies on the angular correlation function of 
brighter (Ssso^im > 4 mJy) single-dish sources. The green and magenta 
lines show the angular congelation function for galaxies and sources (for 
a 15 arcsec beam) respectively, as shown in Fig. The vertical dashed, 
and diagonal dashed and dash-dotted lines, shown for reference, are also 
as described in Fig.j^ Upper panel: The dotted (dashed) orange line indi¬ 
cates the correlation function for sources extracted from simulated images 
generated with a 30 (7.5) arcsec beam. The light blue line is the source cor¬ 
relation function derived from images created with no ‘instrumental’ noise 
added. Lower panel: The dotted orange line indicates the correlation func¬ 
tion for sources extracted from images where the position of galaxies with 
'S’ssOfrm < 2 mJy and z > 2.5 were randomised prior to creation. The 
orange dashed line shows the same for images which had the position of all 
galaxies with Sgso^im < 2 mJy randomised. 


multiple, typically physically unassociated galaxies ( |Cowley et al.| 
|2015| l, with an on-sky separation comparable to or less than the 
size of the beam into an object recognised as a single source by 
the source extraction algorithnj^ Thus the angular distribution of 
sources found in the simulated map is different from the angular 
distribution of the input galaxies. We label this effect ‘blending 
bias,’ 6b, where = [lOs {6)/wg{9)], and note that a similar effect 
has been observed in low resolution X-ray surveys (e.g. |Vikhlinin| 
|& Forman|1995[|Basilakos et al.|2005l >. 

In the upper panel of Fig. 10 we test how sensitive this bias is 
to the size of the beam and ‘instrumental’ noise. We repeat the cal¬ 
culation for deriving the angular correlation function of single-dish 
sources for images generated using Gaussian beams with FWHM 
of 30 and 7.5 arcsec. We kept the instrumental noise constant at 


® In |Cowley et al.|j2015) we showed that this confusion effect boosts the 
cumulative 850 number counts by a factor of ~ 2 at SgsoMm = 4 mJy 
for a 15 arcsec FWHM beam. See also|Hayward et al.|^2013[ and|Munoz| 
|Arancibia et al.|(2Q15) who investigate the effect of coarse angular resolu¬ 
tion on the observed sub-mm number counts. 
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Figure 11. The effect of the redshift interval considered on the angular 
congelation function of Sgso/im > 4 mJy single-dish source counterparts 
(see text). Upper panel: Angular conelation function of single-dish source 
counterparts (magenta lines), Sgso/jm > 4 mJy galaxies (green lines) and 
dark matter (black lines) for the redshift interval 2.25 < z < 2.75 (solid 
lines) and 1.0 < z < 4.0 (dashed lines). Bottom panel: Evolution of 
large scale bias with redshift. Green squares indicate the bias evolution of 
‘S’ssOMm > 4 mJy galaxies, derived from the spatial correlation function as 
in Fig. The dotted, dashed and dash-dotted lines indicate the bias evolu¬ 
tion of halos with Muaio > 10 ^^, 10 ^^ and 10 ^^ h~^ Mq respectively. 
The green bars indicate the bias derived from the angular correlation func¬ 
tions of galaxies and dark matter, with the width of the bar indicating the 
redshift interval considered. The magenta bars show the same but for bias 
derived from the angular correlation functions of single-dish source coun- 
terparfs. The vertical grey line indicates z = 2.5, on which all redshift 
intervals considered are centred. 


cTinst = 1 mjy/beam in each case and used the same flux limit 
of S'ssoMm > 4 mJy to select our sources, noting that varying the 
beam size will change the confusion in the image and thus the over¬ 
all noise. We derived blending bias factors in Ws of ~ 2 and 
~ 8 for the 7.5 and 30 arcsec beams respectively. We tested 
the effect of instrumental noise by creating a set of images with 
a 15 arcsec beam, but without the addition of instrumental noise. 
This can be seen in Fig.[T^to have a negligible effect on the angu¬ 
lar correlation function of the sources, as one would expect given 
that our ‘instrumental’ noise is random and has no dependence on 
scale. 

In the lower panel of Fig.[T^we repeat the calculation on im¬ 
ages which had the positions of galaxies with Ssso/jm < 2 mJy and 
z > 2.5 randomised prior to being created and find that the blend¬ 
ing bias is reduced to 6^ ~ 2. For maps which had the position of 
all galaxies with Sgso/jm < 2 mJy randomised the blending bias 
is approximately unity i.e. has been removed. Although not shown 
in Fig. [T^ we also tested this effect on a set of images which had 
the positions of all galaxies randomised prior to their creation, and 
observed a result consistent with the selected sources being com¬ 



Figure 12. Angular cross coiTelations between two separated redshift inter¬ 
vals, ZA = [1.0, 2.4) and zb = [2.6,4.0). In the legend ‘Sources’ refers 
to the counterparts of sources (see text) extracted from our simulated imag¬ 
ing with SgSO/jm > 4 mJy and ‘Galaxies’ refers to galaxies selected with 
'S’gso/rm > 2 mJy. Top panel: We show the angular cross correlation of: (i) 
Source counterparts in za with source counterpaifs in zb (blue line); (ii) 
Source counterparts in za with galaxies (Sgso/jm > 2 mJy) in zb (green 
line); (iii) Source counterparts in za with galaxies (Sgsorim > 2 mJy) in 
Zb but with the sources extracted from images where the positions of galax¬ 
ies with Sgsojim < 2 mJy and z > 2.5 were randomised prior to creating 
the images (red line); and (iv) Galaxies (Sgso^jm > 2 mJy) in za with 
galaxies (Sgso/jm > 2 mJy) in zb (cyan line). The vertical dashed, and di¬ 
agonal dashed and dash-dotted lines, shown for reference, are as described 
in Fig.[^ Bottom panel: As for top panel but with a linear j/-axis. A dashed 
line at m = 0 has been added for reference. 


pletely unclustered. We conclude that blending bias in the angular 
clustering of single-dish sources is due to the confusion noise or 
rather the clustering of faint unresolved galaxies and the way in 
which, when their emission is smoothed with a single-dish beam, 
this causes certain on-sky positions to be selected as sources. It thus 
depends on the combined effect of the finite beam size, the intrin¬ 
sic clustering of the underlying galaxies, and their intrinsic number 
counts. 

We also consider how calculating the angular correlation func¬ 
tion using different redshift intervals can affect the blending bias. 
In order to assign a redshift to a single-dish source we first de¬ 
fine a source-counterpart as the galaxy which is contributing the 
most sub-mm flux to a source, taking into account the profile of the 
beam. We can then select these counterparts within a given redshift 
interval and recalculate the angular correlation function, now using 
the on-sky position of the counterpart. For the underlying galax¬ 
ies and dark matter we calculate the angular correlation function 
over a given redshift interval by appropriately changing the limits 
in the [Lirnb^ ( | 195 3 ^ equation l|^. An example of this is shown 
in the upper panel of Fig.[TT|for two redshift intervals centred on 
z = 2.5, 2.25 < z < 2.75 (solid lines) and 1.0 < z < 4.0 
(dashed lines). In this way we can derive a large-scale bias, defined 
as [w{9)/w-Dyi{d)Y^'^, for the galaxies and source-counterparts, 
as a function of redshift interval considered. This is shown in the 
bottom panel of Fig. [m where we consider 8 redshift intervals 
of varying width centred on z = 2.5. We can see that the de¬ 
rived source-counterpart bias, which is affected by blending bias, 
increases monotonically as the width of the redshift interval in- 
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creases whilst the bias derived from the angular correlation func¬ 
tion of galaxies is approximately constant and consistent with the 
bias derived from the spatial correlation function (see Section !?!^ 
for all redshift intervals considered. Also evident in this panel is 
how the halo mass can be significantly over-estimated as a result 
of this effect. As a further example of this, using equation ( 8 ) in 
ISheth, Mo & Tormen| j2001t to infer halo mass from a measured 
bias, we find that doubling the bias (i.e. bb = 2 ) of halos with mass 
10 ^^ h~^ Mq yields an inferred halo mass of h~^ Mq at a 

redshift of 2.5, an over-estimation of more than an order of magni¬ 
tude. 

To further illustrate the results in this section we imagine a 
simplified scenario with two distinct redshift intervals A and B 
and two angular positions 9^ and Within each redshift inter¬ 
val the positions of galaxies will be correlated according to some 
w{Si, S 2 , z ± Az, \9i — 02 l)> we define some flux limit Sum 
brighter than which galaxies will be resolved as point sources in the 
beam-smoothed imaging and fainter than which they would require 
some boost to be counted in the single-dish catalogue. 

If we now consider the effect of the beam, we have a beam- 
smoothed flux density field in each redshift interval, 5(flbeam, 2 ± 
Az,9), dominated by galaxies with S < Sum, the distribution 
of which will be correlated with the positions of galaxies with 
S > S'lim in that interval, according to w. It is also now possi¬ 
ble for flux from B to boost objects (at the same on-sky position) 
in A into the selection (and vice-versa). This induces an artificial 
cross-correlation between the sources selected in A and B, as some 
objects in B required a flux boost from A to be considered and this 
flux is correlated with selected objects in A. Thus we make the pre¬ 
diction that the cross-correlation of single-dish source counterparts 
(for sources with Sgso^jm > 4 mjy) in distinct redshift intervals 
will be non-zero, even in the absence of effects such as gravita¬ 
tional leasing which are not considered here. 

This is demonstrated in Fig. where we show the angular 
cross correlation between source counterparts in two distinct red¬ 
shift intervals 1.0 ^ z < 2.4, za, and 2.6 2 ; < 4.0, 2 b (blue 

line). This is found to be non-zero whilst the eqmvalent calcula¬ 
tion for bright galaxies (with Sssoiim > 2 mj}!^ is zero (cyan 
line). We also find that source counterparts in 2a are correlated 
with bright galaxies in 2b, in this case shown for galaxies with 
Sssoiim > 2 mJy (green line). The physical correlation of the faint 
with the bright galaxies in 2b has caused the sources from 2a, 
many of which were selected as sources because of a flux con¬ 
tribution from faint galaxies in 2b, to be correlated with bright 
galaxies in 2b. This is an induced correlation introduced by the 
finite beam. When we repeat the source-galaxy cross-correlation 
using sources from maps which had the positions of galaxies with 
'S'ssoMm < 2 mJy and 2 > 2.5 randomised prior to the image 
being created, the randomisation removes the physical correlation 
between faint and bright galaxies in 2b, thus we find that the in¬ 
duced cross-correlation between sources in 2a and bright galaxies 
in 2b, on scales larger than the beam, is now zero. This is despite 
the fact the positions of galaxies with S'aso/jm > 2 mJy in 2 b were 
not changed. 

We infer that it is these induced cross-correlations that cause 
the trend in blending bias with redshift interval width seen in the 


lower panel of Fig. [m as increasing the redshift interval increases 
the number of induced cross-correlations considered. It also ex¬ 
plains the trends seen in the lower panel of Fig. [T^ as randomis¬ 
ing the positions of faint galaxies reduces the correlation between 
the distribution of flux density, 5, and the distribution of galaxies 
with S > S'lim at a given redshift, and thus the contribution of the 
induced cross-correlation terms. For the same S'lim increasing the 
beam-size will on average increase the multiplicity of sources. As 
the components of each source are, in our simulations, drawn from 
different redshift intervals (galaxies composing a single source are 
generally at different redshifts) this means that for each source 
more induced cross-correlation terms are considered, producing the 
trends seen in the upper panel of Fig.|10[ 

We therefore caution that significant modelling is needed to 
interpret the angular correlation function of sources identified in 
single-dish surveys, at flux limits at which the sources are con¬ 
fused (i.e. composed of multiple fainter galaxies). The implication 
is that the halo masses of the galaxies in question could be seriously 
overestimated if blending bias is not corrected for. It appears from 
Fig. 13 that wi{9), described in the next section, exhibits angu¬ 
lar clustering more representative of the underlying galaxy popula¬ 
tion. We suggest then that information regarding the halo masses of 
SMGs should be inferred from ti)i( 6 ). This comes with the impor¬ 
tant caveat that the effects of correlated noise in observed images, 
e.g. large-scale structure due to correlated atmospheric contamina¬ 
tion and 1 // noise, need to be removed or accurately modelled. 

Targeted follow-up of single-dish sources with interferome¬ 
ters could also be used to overcome blending bias, as the order of 
magnitude better resolution would allow the underlying galaxies 
from which the sources are composed to be identified, down to flux 
limits dependent on integration time. This would provide an ap¬ 
proximately complete flux-limited catalogue of galaxies down to 
slightly above the source-extraction limit of the single-dish survey 
(some galaxies are de-boosted by instrumental noise to below the 
flux limit of the single dish survey and are therefore missed from 
the follow-up observations, e.g. |Karim et al.||2013| |Cowley et al.| 
|2015[ l which could then be used to derive the correlation function 
free from blending bias. 


4.4 The Angular Clustering of Intensity Fluctuations 

In this sectiorP^ we calculate the angular clustering of intensity 
fluctuations in our simulated images, 'u;i(0). We first introduce this 
quantity before describing how it is calculated in this paper. It can 
be defined as 

m,)i{e,)) = {if[i + w,{6)], (10) 

where /(^j) represents the intensity in a given direction 9^, 9 = 
— 621 iC Ihe mean intensity, which can be calculated 
from the number counts of our model by 

{I) = ls^dS. (11) 


^ Here we use a limit of 2, rather than 4 mJy, so we have enough objects for 
a robust determination of it^cross- We do not expect the result to be sensitive 
to this given that the auto-correlation of galaxies is roughly independent of 
flux over this flux range. 


In this section, for ease of reading, and as here we are only considering 
a single band (850 /im), we suppress the explicit frequency dependence in 
our notation. For example, we write the mean intensity at a given observed 
frequency u, {1^), as (7). 
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Figure 13. Predicted angular auto-correlation functions. The angulai* corre¬ 
lation function of the 850 //m intensity fluctuations, derived from the angu¬ 
lar power spectrum of the simulated single-dish imaging, prior to matched 
filtering is shown by the gold line. The gold shaded region indicates the la 
(16 — 84*^ percentile) field-to-field variation over 50 lightcone realisations 
of 4 deg^ each. The grey dotted line indicates the expectation for the an¬ 
gular correlation function of the CIB intensity fluctuations if the galaxies 
contributing to it were unclustered. All other lines are as described in Fig.[^ 



Figure 14. Predicted differential emissivity of our model at 350 fim for a 
range of redshifts, as indicated in the legend. The contribution from central 
(central -i- satellite) galaxies is shown using dotted (solid) lines. 


The function wi{0) can be expressed as a flux-weighted integral of 
the angular correlation function of galaxies, Wg, such that 


wi( 0 ) = 


1 

W 


11 Wg(Si,52,0)5i52^^d5id& 

( 12 ) 


where WglSi, 82 , 0 ) is the angular cross-correlation of galaxies 
with fluxes and S 2 and dri/dSi is the surface density per unit 
solid angle of galaxies with flux Si. The angular cross-correlation 
of galaxies w^{Si, 82 , 0 ) derives from a more general form of 
equation a such that 


Wg(Sl,52,e) 


/ Ni{z)N 2 {z)^dz f du^{Si,S 2 ,r, z) 
J Ndz)dz j N 2 {z)dz ■ 


(13) 


where Ni{z) represents the redshift distribution of galaxies with 
flux Si and ^{Si, 82 , r, z) is the spatial cross-correlation of galax¬ 
ies with S\_ and 82 , at redshift z. We can recover Wg for an indi¬ 
vidual galaxy population by integrating Wg{Si, 82 , 0 ) over the flux 
limits defining the selection of the population. The term containing 
the Dirac delta function 5° (0) on the right hand side of equation 
GH is the shot noise, which arises from galaxies being approxi¬ 
mated as point sources. 

We can calculate wi for the clustered galaxy population di¬ 
rectly from our simulated images using the estimator 


wi(6») 


T,ij SrSjQjj 


(14) 


where Si is the fractional variation of flux in the i**' pixel and is 
calculated using Si = {Si/{ 8 }) — 1 where Si is the flux value of 
the pixel and { 8 ) is the average flux value of a pixel, as all of 
our pixels are of equal area. The step function Qij is 1 if pixels 
i and j are separated by a distance in the angular bin 0 ± /S.0/2 


and zero otherwise. However, in practice it is more computation¬ 
ally efficient to make use of the fact that wi can be obtained from 
the angular power spectrum of CIB anisotropies. Pi (kg), using a 
Fourier transform such that 

27r r 

Wi{0) = J Pi{kg)jQ{2nkg0)kgdkg, (15) 

where Jo is the zeroth order Bessel function of the first kind and 
the convention kg — 1 /Xg is usecj^ We therefore compute Pi{kg) 
directly from our simulated images, prior to any matched-filtering, 
and make use of equation GD to calculate wi. This quantity is 
shown in Fig. [T^ (gold line), with the corresponding shaded re¬ 
gion indicating the la percentile variation of our 50 lightcone re¬ 
alisations at a given 0. The Gaussian-like profile on small scales 
(0 < 30 arcsec) is due to the beam used to convolve the simulated 
image and is mostly produced by the shot noise term in equation 
GH- It can be seen that on scales larger than the beam wi is very 
similar to trig, which is unsurprising given that ~ 70% of the total 
background light predicted by the model at 850 ^m is in galaxies 
with S'asoA.m > 0.25 miy. 


5 ANGULAR POWER SPECTRUM OE CIB 
ANISOTROPIES 

The galaxies which contribute to the bulk of the CIB cannot be indi¬ 
vidually resolved with current instruments, and instead information 
regarding their clustering and hence the masses of the halos they oc¬ 
cupy is derived from observations of the clustering of fluctuations 


We use this convention as it is the standard practice for angular power 
spectra of CIB anisotropies (e.g. [Gautier et al.|199^ |Viero et ^|20Q9^ . 
Under this convention the angular wavenumber is related to the multipole 
index, £,by £ = 27rk0 (when angles are measured in radians). 
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Figure 15. Angular power spectra of CIB anisotropies predicted by our model at 250, 350 and 500 fim (left to right panels). The blue solid line indicates 
the power spectrum averaged over 3 randomly orientated lightcones, each with an area of 20 deg2. The dashed blue lines in the left panel indicate the power 
spectra for each of these fields individually. The horizontal dashed line shows the predicted shot noise contribution to power spectra. The dashed red line shows 
the prediction of our model after the fluxes of our simulated galaxies have been rescaled (see text). We compare to the observational data of Veiro et al. j2013| 
squares) with the filled and open squares coiTesponding fo different levels of masking, and to that of the Planck Collaboration et al. j2Q14| triangles). 


in the background light. Therefore, in this section we compare pre¬ 
dictions with recent measurements of the angular power spectmm 
of CIB anisotropies P{(kg). Here v is a fixed observed frequency 
[related to the emitted frequency, v^,'by v = Vg{l -i- 

The angular power spectrum of CIB anisotropies was intro¬ 
duced in equation and can be expressed as an integral over 
redshift of the 3D power spectrum of fractional emissivity fluctu¬ 
ations Vj (k, z), where spatial wavenumber k is related to spatial 
wavelength A by the convention k = 2 tvIX. Using the approxima¬ 
tion of |Limber| ( |1953| l, the small-angle approximation (kg 1) 
and assuming a flat cosmology, we can write (kg) (for kg in 
units of radians”^) as 

(j„(z)fVj{k = 2nkg/x,z) (16) 

(e.g. [Viero et ^|2009[ |Shang et al.||2Q12^ . Here x is the radial 
comoving distance to redshift z, a = (1 -I- a)”^ is the cosmologi¬ 
cal scale factor and {jv{z)) describes the mean emissivity per unit 
solid angle at redshift z, which can be expressed as 


(j.C)) - j 


(17) 


and related to the mean intensity (see equation [TT) by 

{h) = j dz^a{j.{z)). (18) 


In our model, not all halos contribute equally to {j^{z)}. We 
can therefore define a differential emissivity djT/dlogj^Q Mh (e.g. 
[Shang et al.|2012] [Bethermin et al.|2013^ such that equation l[^ 
can be expressed as 


prikg) = 


III 


dz d logiQ Mh d log^o Mh 


fdX f a 


dz \x 




( 19 ) 


d logip Mh d logio Ml 


where Mh, Mh, «) is the 3D cross-spectrum of fractional 

emissivity fluctuations, between halos of mass Mh and Mj'. 

Whilst in principle it is possible to calculate {ju(z)) and 
Pj (k, z) from the output of our model, for simplicity we com¬ 
pute Pi (kg) from a simulated image of a lightcone catalogue at 
the wavelength of interest. 

Here, as we compare Pi (kg) predicted by the model to recent 
Herschel-SPIRE data ( [Viero et al.||2013) , we use wavelengths of 
250, 350 and 500 /rm, and a Gaussian beam with a FWHM of 18, 
25 and 36 arcsec respectively, to create our imaging. For simplic¬ 
ity we do not add any instrumental noise to these maps. Following 
the procedure outlined earlier we generate a lightcone catalogue 
including galaxies brighter than the flux at which we recover 90 
percent of the predicted CIB at the wavelength of interest (this pre¬ 
dicted CIB agrees well with the observations of jFixsen et al.|(T998) ( 
at all wavelengths) and choose a pixel scale such that the beam is 
well sampled. We generate 3 x 20 deg^ lightcones in order to have 
a similar total area to that used by Viero et al. 

First, we show the differential emissivity of our model (de¬ 
scribed above) at 350 nm in Fig. [T^ in terms of the contribution 
from central and satellite galaxies. The contribution from central 
galaxies peaks in the halo mass range — 10^^ h~^ Mq at 

all redshifts, with the peak evolving modestly from lower to higher 
halo masses from z = 5 to 2 = 2, and then being approximately 
constant for 2 < 2. The contribution from satellite galaxies spans 
a broader range of halo mass and peaks at higher halo mass, how¬ 
ever, it is much smaller than that of the central galaxies, being only 
~ 6% of the total 350 /j-m emissivity at 2 = 3.1 and only ~ 14% 
at 2 = 0.5. 

In Fig.f^we compare Pj" (kg) predicted by our model to the 
observations of [Viero et al.| ( |20T3t . The horizontal dashed line in 
each panel represents the predicted shot noise. This is the power 
that would be expected if the background were composed of an 
un-clustered population of point sources and as such has no scale- 
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Figure 16. An example of our flux re-scaling scheme at 350 /im. Top panel: 
Predicted number counts (blue line) showing the contribution to the counts 
from starburst and quiescent galaxies (dotted and dot-dashed lines respec¬ 
tively). The red dashed line shows the number counts after the flux rescal¬ 
ing has been applied. Observational data are taken from Clements et al. 
j2Q10| op en squares), Oliver et al. l |2010| open triangles) and Bethermin et 
al. l |2012| filled squares). Bottom panel: The flux rescaling applied to simu¬ 
lated galaxies as a function of original model flux. A horizontal dotted line 
is drawn at unity for reference. The vertical dashed line in both panels in¬ 
dicates a flux of 50 mJy, the limit brighter than which we do not include 
galaxies in our image in order to match the masking applied by Viero et al. 

(20T3) . 


dependence. It is related to the number counts of the model by 



Sf^dS., 


( 20 ) 


(e.g. |Tegmark & Efstathiou|1996l l, where S'cut is the limit above 
which sources can be resolved and are therefore removed/masked 
from further analysis in order to reduce the shot noisj^ Note that 
this contribution to the power spectrum corresponds to the Dirac 
delta function term in equation in}. 

We show the two extremes of masking schemes applied by 
Viero et al. to their data, in order to reduce the shot noise in their 
images. They identified sources by finding peaks > 3a in the 
matched-filtered SPIRE images at each wavelength. Sources above 
a given flux limit (5cut) were then masked by circles with a 1.1 x 
FWHM diameter, before calculating the power spectra. Extended 
sources were removed by using the criterion S'cut = 400 mJy. We 
compare to the most extreme masking case Scut = 50 mJy (open 


Imposing the limit Scut is necessary a s for Euclidean number counts 
(dr;/dS oc S“^ ®) the integral in equation j20| does not converge. 


squares) and mimic the masking applied by | Viero et al.| ( [20T3) l by 
excluding galaxies with S^ > 50 mJy prior to the creation of our 
simulated images. We have tested that masking pixels in the full 
image produces near identical results. 

At 350 and 500 /rm we also compare our predictions to the 
observational data of the Planck Collaboration (XXX, [20T4l l. These 
authors employ a slightly different masking scheme to that used 
by Viero et al., however this has a negligible effect on the scales 
covered by their data. Encouragingly, both observational datasets 
are in good agreement. 

We note that there is a discrepancy between the model predic¬ 
tions and the observational data of a factor ~ 2 over all wavelengths 
and angular scales. Whilst this represents much better agreement 
than for previous versions of our model (e.g. |Kim et al.|2012^ we 
investigate whether it is possible to further improve this by forc¬ 
ing a better agreement between our predicted number counts and 
those that are observed. By construction, this gives us the observed 
surface density of objects and should make the shot noise terms 
equal. This is merely an illustrative exercise to replicate one of 
the freedoms of empirical models which are constrained to match 
the observed counts e.g. HOD modelling. An example of this is 
shown in Fig.[T^ where we scale the fluxes of our galaxies by the 
function shown in the bottom panel, chosen such that it brings our 
model number counts into better agreement with the observed data 
(top panel). We then apply this scaling relation to our galaxies prior 
to the creation of our simulated images and recalculate the power 
spectrum, resulting in the dashed red line in Fig. [T^ This exercise 
produces power spectra in much better agreement with the observed 
data, even at low values of ke where clustering dominates over the 
shot noise. We recognise that this is an artificial adjustment to our 
model. However it is a relatively minor one as we do not adjust the 
flux of our galaxies by more than ~ 40% across all three bands. We 
do not draw strong conclusions from this, but simply note that good 
agreement with the observed number counts is required to repro¬ 
duce the observed power spectra. In this case we have adjusted our 
number counts artificially but in future this could be achieved by 
developments to the treatment of physical processes in the model. 

At 250 fim there remains a small (~ 25%) discrepancy be¬ 
tween the observed shot noise and that predicted by our flux rescal¬ 
ing, despite the fact that the number counts are in close agreement 
(~ 14%). We attribute this to field-to-field variation between the 
fields used to measure the observed number counts and those used 
for measuring power spectra, and the uncertainties on both mea¬ 
surements. 

As the FIR emissivity is dominated by a halo mass range of 
IQii-S _ ioi2 /j-i Mo (e.g. at 350 ^m and 2 = 3.1, 54% of the 
total emissivity comes from halos in this mass range) we investi¬ 
gate whether this mass range also contributes most to the angular 
power spectrum of CIB anisotropies. We retain the masking flux 
limit of S'cut = 50 mJy from Viero et al. and divide our light- 
cone catalogue into three halo mass bins of 0.5 dex width, which 
span the peak of the differential emissivity distribution shown in 
Fig.[T^ We then construct an image for each bin. The cross-power 
spectra for these images are shown in Fig. [T^ We have ignored 
the contribution from halos outside the mass bins chosen for this 
plot, however, the bins chosen contribute ~ 90% of the total power 
spectrum (for Saso^m < 50 mJy). We can see that the same halo 
mass bin which dominates the emissivity dominates the contribu¬ 
tion to the power spectrum, as one might expect if the fractional 
cross-power spectrum term, Vj {k, Mh, z), in equation | |19| is 
a smoothly varying function of halo mass, given the peaked nature 
of the dy„/dlogjQ Mh term. 
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Figure 17. Power spectrum of the CIB predicted by our model at 350 /rm for Saso/jm < 50 mJy (solid grey line) divided into the following halo mass bins 
lO'^^Mh ^ 10^^ ® h~^ M 0 , 10^^ ® ^ Afn < 10^^ h~^ Mq and 10^^ ^ M^^ < 10^^ ® h~^ Mq. The diagonal panels indicate the auto-power spectrum 
of the halo mass bin indicated in the panel. The off-diagonal panels indicate the cross-power spectrum between different bins, as indicated in the panel. The 
dashed grey horizontal line indicating the total shot noise for the Saso/jm < 50 mJy population. 


To investigate the fluxes of the galaxies which contribute most 
to the power spectrum, we divide our lightcone catalogue into four 
flux bins and construct an image for each. The cross-power spectra 
for these images shown for 350 fim in Fig.[T^ We can see immedi¬ 
ately that on larger angular scales (kg < 0.1 arcmin~^) the power 
is dominated by galaxies in the faintest bin < 5 mJy (e.g. top 
left panel), whilst the shot noise is dominated by brighter galaxies 
(e.g. bottom right panel). In our model the dominant shot noise con¬ 
tribution at 350 fim (for galaxies with S 350 Mm < 50 mJy) comes 
from galaxies with Ss^oum ~ 20 mJy. 


6 SUMMARY 

We present predictions for the clustering evolution of dusty star¬ 
forming galaxies selected by their total infra-red luminosity (Lir), 
and their emission at far infra-red (FIR) and sub-millimetre (sub- 
mm) wavelengths. This includes the first predictions for potential 
biases on measurements of the angular clustering of these galaxies 
due to the coarse angular resolution of the single-dish telescopes 
used for imaging surveys at these wavelengths. Our model incorpo¬ 
rates a state-of-the-art semi-analytic model of hierarchical galaxy 
formation, a dark matter only A^-body simulation which utilises the 


WMAP7 cosmology and a simple model for calculating the emis¬ 
sion from interstellar dust heated by stellar radiation, in which dust 
temperature is calculated self-consistently. 

We present predictions for the spatial clustering of galax¬ 
ies selected by the total infra-red luminosity for Lir ~ 10® — 
10^® h~^ Lq for 2 ; = 0 — 5. We find that the clustering evolu¬ 
tion in our model depends on the luminosity of the selected galax¬ 
ies. The large-scale bias evolution of our most luminous galaxies 
(10^®-10^®'® Lq) is consistent with them residing in halos of 
mass — 10^® h~^ Mq over this redshift range. In the model, 
this halo mass range is the one most conducive to star formation 
over these redshifts. For lower luminosity populations the range of 
halo masses selected changes with redshift, such that generally they 
move to higher mass halos with increasing redshift. 

We find that 850 /rm selected galaxies in our model repre¬ 
sent a clustered population, with an S'gsoMm > 4 mJy selected 
sample having a correlation length of ro = 5.51 q'| h~^ Mpc at 
2 = 2.6, consistent with observations of |Hickox et al.|j2012[ > and 
|Blain et S?| ( |2004^ . The bias with which they trace the dark matter 
evolves with redshift in a way consistent with the SMGs residing 
in halos of 10^^ ® — 10^® h~^ Mq up to a redshift of 2 ~ 4. 
This result is insensitive to the flux limit used to select the galax¬ 
ies for 0.25 < S'ssoMm < 4 mJy, and we note that even at the 
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Figure 18. Power spectrum of the CIB predicted by our model at 350 /rm divided into the following flux bins 0 ^ < 5 mJy, 5 Si Ssso^m < 10 mJy, 

10 Si Ssso^m < 20 mJy and 20 S Ssso/rm < 50 mJy. The diagonal panels indicate the auto-power spectrum of the flux bin indicated in the panel and as 
such contains the shot noise term, indicated by the horizontal dashed line. The off-diagonal panels indicate the cross-power spectrum between different bins, 
as indicated in the panel. The solid grey line in each panel indicates the total power for Si/ < 50 mJy, with the dashed grey horizontal line indicating the total 
shot noise. 


faintest fluxes investigated (S'sso/jm ^ 0.25 mJy) the model pre¬ 
dicted 850 /tm number counts are dominated by starburst galaxies. 
Interestingly, the halo occupation distribution for 850 /rm central 
galaxies peaks well below unity. Halo abundance matching mod¬ 
els which force the HOD of central galaxies to equal unity would 
place galaxies in much more massive halos than our model, given 
the same galaxy number density. We find further that our bright¬ 
est SMGs (5g5o,jm > 4.0 mJy) evolve into z — 0 galaxies with 
stellar mass ~ 10^^ h~^ Mq, occupying a broad range of present 
day halo masses lO'^^ — 10^'* h~^ M©. Thus, in our model, bright 
SMGs do not necessarily trace the progenitors of the most mas¬ 
sive 2 = 0 environments. Our Sgso/jm selected galaxy populations 
share significant overlap with the most infra-red luminous galaxy 


populations Lir ~ 10^^ h ^ L©, and thus exhibit similar cluster¬ 
ing evolution. 

We make predictions for the angular clustering of sub-mm 
sources identified in the SCUBA-2 Cosmology Legacy Survey. We 
show that the angular clustering of 850 fim single-dish selected 
sources is biased with respect to that of the underlying galaxy 
population, in our model by a factor of ~ 4. We attribute this 
‘blending bias’ to the coarse angular resolution of single dish tele¬ 
scopes blending the sub-mm emission of many (typically phys¬ 
ically unassociated) galaxies into a single source. This induces 
cross-correlation terms between sources selected at different red- 
shifts. The position of a galaxy at 2 a boosted into the source se¬ 
lection by fainter galaxies at some other redshift 2 b will thus be 
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correlated with the positions of galaxies at 2 b, some of which will 
already be included in the source selection. It is the addition of 
these induced cross-correlations that leads to the ‘blending bias’. 
The value of this bias depends on the size of the beam, the intrinsic 
clustering of the underlying galaxy population, and their number 
counts. 

We caution that this severely complicates the interpretation 
of measurements of the angular clustering of SMGs derived from 
single-dish survey source catalogues, and if not considered could 
lead to the halo masses for SMGs being significantly overestimated. 
The angular clustering of galaxies selected at 850 /rm in our model 
is insensitive to the flux limit used (as is the case for the spatial 
clustering), and agrees with the angular clustering of intensity fluc¬ 
tuations predicted by the model at that wavelength. 

The FIR emissivity of our model is dominated by the emission 
from halos in the mass range — 10^^ h~^ Mq independent 

of redshift, and this halo mass range also dominates the angular 
power spectrum of GIB anisotropies. Our model agrees with the 
observed angular power spectrum of GIB anisotropies at Herschel- 
SPIRE wavelengths (250, 350 and 500 ^m, Viero et al. |2013| ) to 
within a factor of ~ 2 over all scales, representing an improvement 
over previous versions of the model. This agreement can be further 
improved on by making minor (< 40%) artificial adjustments to 
the fluxes of our galaxies which bring the predicted number counts 
into better agreement with those observed. 

Galaxies selected by their FIR/sub-mm emission represent a 
large proportion of the cosmic star formation over the history of 
the Universe. As such, understanding the nature of these galaxies is 
critical to a full understanding of galaxy formation. In our model, 
the galaxies that contribute to the bulk of the GIB are predominantly 
disc instability triggered starbursts which reside in a relatively nar¬ 
row range of halo masses 10^^'® — 10®^ h~^ Mq for 2 < 5. 

Abundance matching arguments which combine the observed 
stellar mass function with the theoretically predicted halo mass 
function at 2 = 0 imply that this is also the mass range for present- 
day halos for which the conversion of baryons into stars has been 
most efficient (e.g. |Guo et al.|2010^ . The stellar fraction in a halo 
depends on an integral over the past history of star formation in 
all of the progenitors of that halo. In our model, the fact that the 
conversion efficiency of baryons into stars peaks in present day ha¬ 
los of mass ~ 10^^'® — 10^^ h~^ Mq is a simple consequence 
of most of the star formation occurring in such halos over a large 
range of redshifts (2 < 5), combined with the growth of halos by 
hierarchical structure formation. This in turn is a consequence of 
the physical prescriptions on which our model for galaxy forma¬ 
tion is based, in particular for gas cooling in halos and feedback 
from supernovae and AGN. Observationally, information regarding 
the host halo masses of selected galaxies can be derived from mea¬ 
surements of their clustering, however extracting significant results 
from observations at FIR/sub-mm wavelengths is a challenging ex¬ 
ercise. This work presents predictions which we hope will inform 
the interpretation of future observations. 
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